28 research outputs found

    The emergence of knowledge exchange: an agent-based model of a software market.

    We investigate knowledge exchange among commercial organisations, the rationale behind it and its effects on the market. Knowledge exchange is known to be beneficial for industry, but in order to explain it, authors have used high-level concepts like network effects, reputation and trust. We attempt to formalise a plausible and elegant explanation of how and why companies adopt information exchange and why it benefits the market as a whole when this happens. This explanation is based on a multi-agent model that simulates a market of software providers. Even though the model does not include any high-level concepts, information exchange naturally emerges during simulations as a successful, profitable behaviour. The conclusions reached by this agent-based analysis are twofold: (1) a straightforward set of assumptions is enough to give rise to exchange in a software market; (2) knowledge exchange is shown to increase the efficiency of the market. Keywords: Agent-based Computational Economics, adaptive behaviour, knowledge sharing, market efficiency.
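    For intuition, here is a minimal agent-based sketch (in Python) of the kind of mechanism described: providers adapt their strategy by imitating more profitable peers, and knowledge exchange can spread without any built-in notion of reputation or trust. The agent count, payoff rule and imitation scheme are illustrative assumptions, not the paper's model.

    import random

    class Provider:
        """A software provider that may or may not exchange knowledge with peers."""
        def __init__(self, shares_knowledge):
            self.shares_knowledge = shares_knowledge   # current strategy
            self.knowledge = 1.0                       # stock of reusable know-how
            self.profit = 0.0

        def serve_market(self):
            # Revenue grows with accumulated knowledge; sharing has a small fixed cost.
            cost = 0.1 if self.shares_knowledge else 0.0
            self.profit += self.knowledge - cost

    def step(providers):
        sharers = [p for p in providers if p.shares_knowledge]
        if len(sharers) > 1:
            # Knowledge exchange: sharers absorb a fraction of the pooled know-how.
            pool = sum(p.knowledge for p in sharers) / len(sharers)
            for p in sharers:
                p.knowledge += 0.05 * pool
        for p in providers:
            p.serve_market()
        # Adaptation: a randomly chosen provider imitates a more profitable peer.
        a, b = random.sample(providers, 2)
        if b.profit > a.profit:
            a.shares_knowledge = b.shares_knowledge

    random.seed(0)
    market = [Provider(shares_knowledge=random.random() < 0.2) for _ in range(20)]
    for _ in range(200):
        step(market)
    print(sum(p.shares_knowledge for p in market), "of 20 providers now exchange knowledge")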

    Decentralized supply chain formation using max-sum loopy belief propagation

    Supply chain formation is the process by which a set of producers within a network determine the subset of these producers able to form a chain to supply goods to one or more consumers at the lowest cost. This problem has been tackled in a number of ways, including auctions, negotiations, and argumentation-based approaches. In this paper we show how this problem can be cast as an optimization of a pairwise cost function. Optimizing this class of energy functions is NP-hard, but efficient approximations to the global minimum can be obtained using loopy belief propagation (LBP). Here we detail a max-sum LBP-based approach to the supply chain formation problem, involving decentralized message-passing between supply chain participants. Our approach is evaluated against a well-known decentralized double-auction method and an optimal centralized technique, showing several improvements over the auction method: it obtains better solutions for most network instances that allow for competitive equilibrium (competitive equilibrium, in the sense of Walsh and Wellman, is a set of producer costs that permits a Pareto-optimal state in which agents in the allocation receive non-negative surplus and agents not in the allocation would acquire non-positive surplus by participating in the supply chain), while also optimally solving problems where no competitive equilibrium exists, for which the double-auction method frequently produces inefficient solutions. © 2012 Wiley Periodicals, Inc.
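    As a concrete illustration of the message-passing idea, here is a minimal max-sum sketch on a toy three-participant chain (producer, assembler, consumer). The network, utilities and penalty values are illustrative assumptions; they are not the supply chain instances or the exact update schedule used in the paper.

    states = [0, 1]  # 0 = stay out of the chain, 1 = participate

    # Unary utilities: negative production cost if participating, gain for the consumer.
    unary = {"producer": {0: 0.0, 1: -2.0},
             "assembler": {0: 0.0, 1: -1.0},
             "consumer": {0: 0.0, 1: 5.0}}

    # Pairwise utilities penalise an active node whose upstream partner is inactive.
    def pairwise(up, down):
        return -10.0 if (down == 1 and up == 0) else 0.0

    edges = [("producer", "assembler"), ("assembler", "consumer")]
    neighbours = {"producer": ["assembler"],
                  "assembler": ["producer", "consumer"],
                  "consumer": ["assembler"]}

    # messages[(i, j)][x_j]: what participant i tells participant j about j's states.
    messages = {(i, j): {s: 0.0 for s in states}
                for i in neighbours for j in neighbours[i]}

    for _ in range(20):  # synchronous max-sum updates until (hopefully) converged
        new = {}
        for (i, j) in messages:
            for xj in states:
                best = max(
                    unary[i][xi]
                    + (pairwise(xi, xj) if (i, j) in edges else pairwise(xj, xi))
                    + sum(messages[(k, i)][xi] for k in neighbours[i] if k != j)
                    for xi in states)
                new.setdefault((i, j), {})[xj] = best
        messages = new

    for node in neighbours:
        belief = {s: unary[node][s] + sum(messages[(k, node)][s]
                                          for k in neighbours[node]) for s in states}
        print(node, "participates" if max(belief, key=belief.get) == 1 else "stays out")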

    Agent-based simulation of lock-in dynamics in a duopoly

    Lock-in is observed in real-world markets of experience goods, i.e. goods whose characteristics are difficult to determine in advance but are ascertained upon consumption. We create an agent-based simulation of consumers choosing between two experience goods available in a virtual market, modelling the consumers on a grid that represents their spatial network. Using simple assumptions, including identical distributions of product experience and consumers having a degree of follower tendency, we explore the dynamics of the model through simulation. We first conduct simulations that create a lock-in and then test several hypotheses on how to break an existing lock-in, including the effects of advertising and free give-aways. Our experiments show that the key to successfully breaking a lock-in is the creation of regions in the consumer population. Regions arise from the degree of local conformity between agents within them and spread throughout the population when a mildly superior competitor is available. These regions may be likened to a market niche that gains in popularity until it transitions into the mainstream.
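    A minimal grid sketch of the lock-in mechanism described above, with all parameter values (grid size, follower tendency, quality gap, noise) chosen purely for illustration:

    import random

    SIZE, STEPS, FOLLOW = 20, 3000, 0.7   # grid size, update steps, follower tendency
    QUALITY = {"A": 0.5, "B": 0.55}       # product B is mildly superior

    random.seed(1)
    grid = [[random.choice("AB") for _ in range(SIZE)] for _ in range(SIZE)]

    def neighbours(i, j):
        # Von Neumann neighbourhood on a wrap-around grid.
        return [grid[(i + di) % SIZE][(j + dj) % SIZE]
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))]

    for _ in range(STEPS):
        i, j = random.randrange(SIZE), random.randrange(SIZE)
        local = neighbours(i, j)
        if random.random() < FOLLOW:
            # Follower tendency: conform to the locally most popular product.
            grid[i][j] = max("AB", key=local.count)
        else:
            # Otherwise choose on (noisy) experienced quality alone.
            grid[i][j] = max("AB", key=lambda p: QUALITY[p] + random.gauss(0, 0.1))

    share_a = sum(row.count("A") for row in grid) / SIZE ** 2
    print(f"market share of product A: {share_a:.2f}")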

    Domain adaptation for reinforcement learning on the Atari

    Deep reinforcement learning is a powerful machine learning paradigm that has had significant success across a wide range of control problems. This success, however, often requires long training times to achieve. Observing that many problems share similarities, it is likely that much of this training would be redundant if knowledge could be efficiently and appropriately shared across tasks. In this paper we demonstrate a novel adversarial domain adaptation approach to transfer state knowledge between domains and tasks on the Atari game suite. We show how this approach can successfully transfer across very different visual domains of the Atari platform, focusing on semantically related games that involve returning a ball with the user-controlled agent. Our experiments demonstrate that our method reduces the number of samples required to successfully train an agent to play an Atari game.
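    A minimal PyTorch sketch of the general adversarial domain-adaptation idea: a shared encoder is trained so that a domain discriminator cannot tell source frames from target frames, via a gradient-reversal layer. The network sizes, frame shapes, action count and stand-in losses are illustrative assumptions and do not reproduce the paper's architecture or training procedure.

    import torch
    import torch.nn as nn

    class GradReverse(torch.autograd.Function):
        @staticmethod
        def forward(ctx, x):
            return x.clone()
        @staticmethod
        def backward(ctx, grad):
            return -grad  # reverse gradients flowing back into the encoder

    encoder = nn.Sequential(                       # shared state encoder
        nn.Conv2d(4, 16, 8, stride=4), nn.ReLU(),
        nn.Conv2d(16, 32, 4, stride=2), nn.ReLU(),
        nn.Flatten(), nn.Linear(32 * 9 * 9, 256), nn.ReLU())
    policy_head = nn.Linear(256, 6)                # action logits for the RL task
    discriminator = nn.Sequential(                 # predicts source vs. target domain
        nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 2))

    opt = torch.optim.Adam(list(encoder.parameters())
                           + list(policy_head.parameters())
                           + list(discriminator.parameters()), lr=1e-4)

    # One illustrative update on random stand-in batches of 84x84 stacked frames;
    # the action targets stand in for whatever RL loss is being optimised.
    src, tgt = torch.randn(8, 4, 84, 84), torch.randn(8, 4, 84, 84)
    actions = torch.randint(0, 6, (8,))

    feats_src, feats_tgt = encoder(src), encoder(tgt)
    task_loss = nn.functional.cross_entropy(policy_head(feats_src), actions)
    domain_logits = discriminator(GradReverse.apply(torch.cat([feats_src, feats_tgt])))
    domain_labels = torch.cat([torch.zeros(8, dtype=torch.long),
                               torch.ones(8, dtype=torch.long)])
    domain_loss = nn.functional.cross_entropy(domain_logits, domain_labels)

    opt.zero_grad()
    (task_loss + domain_loss).backward()
    opt.step()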

    Traffic3D: A new traffic simulation paradigm

    The field of deep reinforcement learning has evolved significantly over the last few years. However, an important and not yet fully attained goal is to produce intelligent agents which can be successfully taken out of the laboratory and employed in the real world. Intelligent agents that are successfully deployable in real-world settings require substantial prior exposure to their intended environments. When this is not practical or possible, the agents benefit from being trained and tested on powerful test-beds that effectively replicate the real world. To achieve traffic management at an unprecedented level of efficiency, in this work we demonstrate Traffic3D, a significantly richer new traffic simulation environment: a platform to effectively simulate and evaluate a variety of 3D road traffic scenarios, closely mimicking real-world traffic characteristics, including faithful simulation of individual vehicle behavior, precise physics of movement and photo-realism. In addition to deep reinforcement learning, Traffic3D also facilitates research in several other domains such as imitation learning, learning by interaction, visual question answering, object detection and segmentation, unsupervised representation learning and procedural generation.
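    To illustrate how a reinforcement learning agent might interact with such a simulator, here is a minimal gym-style sketch. The Traffic3DEnv class, its observation and action spaces, the reward and the tabular learner are hypothetical stand-ins invented for illustration; they are not Traffic3D's actual interface.

    import random

    class Traffic3DEnv:
        """Hypothetical junction-control environment; not Traffic3D's real API."""
        def __init__(self, n_phases=4):
            self.n_phases = n_phases
            self.queue = [0] * n_phases                # waiting vehicles per approach

        def reset(self):
            self.queue = [random.randint(0, 10) for _ in range(self.n_phases)]
            return tuple(self.queue)

        def step(self, phase):
            self.queue[phase] = max(0, self.queue[phase] - 5)   # a green phase clears vehicles
            for p in range(self.n_phases):                       # other approaches accumulate arrivals
                if p != phase:
                    self.queue[p] += random.randint(0, 2)
            reward = -sum(self.queue)                            # minimise total waiting vehicles
            return tuple(self.queue), reward, False

    # Tabular Q-learning stands in for a deep RL agent, purely for brevity.
    random.seed(3)
    env, q, alpha, gamma, eps = Traffic3DEnv(), {}, 0.1, 0.95, 0.1
    state = env.reset()
    for _ in range(10000):
        if random.random() < eps or state not in q:
            action = random.randrange(env.n_phases)
        else:
            action = max(q[state], key=q[state].get)
        next_state, reward, done = env.step(action)
        best_next = max(q.get(next_state, {}).values(), default=0.0)
        current = q.setdefault(state, {}).get(action, 0.0)
        q[state][action] = current + alpha * (reward + gamma * best_next - current)
        state = next_state
    print("distinct states visited:", len(q))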

    Traffic3D: A rich 3D traffic environment to train intelligent agents

    The last few years have marked substantial development in the domain of deep reinforcement learning. However, a crucial and not yet fully achieved objective is to devise intelligent agents which can be successfully taken out of the laboratory and employed in the real world. Intelligent agents that are successfully deployable in true physical settings require substantial prior exposure to their intended environments. When this is not practical or possible, the agents benefit from being trained and tested on powerful test-beds that effectively replicate the real world. To achieve traffic management at an unprecedented level of efficiency, in this paper we introduce Traffic3D, a significantly richer new traffic simulation environment. Traffic3D is a unique platform built to effectively simulate and evaluate a variety of 3D road traffic scenarios, closely mimicking real-world traffic characteristics, including faithful simulation of individual vehicle behavior, precise physics of movement and photo-realism. We discuss the merits of Traffic3D in comparison to state-of-the-art traffic-based simulators. Along with deep reinforcement learning, Traffic3D facilitates research across various domains such as object detection and segmentation, unsupervised representation learning, visual question answering, procedural generation, imitation learning and learning by interaction.

    An efficient knowledge transfer solution to a novel SMDP formalization of a broker's decision problem

    This paper introduces a new technique for optimizing the trading strategy of brokers that autonomously trade in retail and wholesale markets. Simultaneous optimization of retail and wholesale strategies has been considered intractable by existing studies. Therefore, each of these strategies is optimized separately and their interdependence is generally ignored, with resulting broker agents not aiming for a globally optimal retail and wholesale strategy. In this paper, we propose a novel formalization, based on a semi-Markov decision process (SMDP), which globally and simultaneously optimizes retail and wholesale strategies. The SMDP is solved using hierarchical reinforcement learning (HRL) in multi-agent environments. To address the curse of dimensionality, which arises when applying SMDP and HRL to complex decision problems, we propose an efficient knowledge transfer approach. This enables the reuse of learned trading skills in order to speed up learning in new markets, while also making the broker transportable across market environments. The proposed SMDP-broker has been thoroughly evaluated in two well-established multi-agent simulation environments within the Trading Agent Competition (TAC) community. Analysis of controlled experiments shows that this broker can outperform the top TAC-brokers. Moreover, our broker is able to perform well in a wide range of environments by reusing knowledge acquired in previously experienced settings.
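    A minimal sketch of the two ideas being combined: temporally extended retail and wholesale options trained against a shared profit signal, with their learned Q-tables copied into a new market to warm-start learning. The states, actions, rewards and market dynamics here are illustrative assumptions rather than the paper's SMDP formalization.

    import random

    class Option:
        """A temporally extended sub-policy, e.g. retail pricing or wholesale bidding."""
        def __init__(self, actions):
            self.actions, self.q = actions, {}

        def act(self, state, eps=0.1):
            if random.random() < eps or state not in self.q:
                return random.choice(self.actions)
            return max(self.q[state], key=self.q[state].get)

        def update(self, state, action, reward, alpha=0.1):
            self.q.setdefault(state, {a: 0.0 for a in self.actions})
            self.q[state][action] += alpha * (reward - self.q[state][action])

    def run_market(retail, wholesale, rounds=5000, demand_bias=0.0):
        profit = 0.0
        for _ in range(rounds):
            state = random.choice(["low_demand", "high_demand"])
            price = retail.act(state)                  # retail option: set tariff level
            bid = wholesale.act(state)                 # wholesale option: purchase cost level
            demand = (2 if state == "high_demand" else 1) + demand_bias
            reward = demand * price - bid              # toy joint profit signal
            retail.update(state, price, reward)        # both options learn from the same reward
            wholesale.update(state, bid, reward)
            profit += reward
        return profit

    random.seed(2)
    retail, wholesale = Option(actions=[1, 2, 3]), Option(actions=[1, 2])
    run_market(retail, wholesale)                      # learn in the source market

    # Knowledge transfer: copy the learned option Q-tables into a new market environment.
    new_retail, new_wholesale = Option([1, 2, 3]), Option([1, 2])
    new_retail.q = {s: dict(v) for s, v in retail.q.items()}
    new_wholesale.q = {s: dict(v) for s, v in wholesale.q.items()}
    print("profit with transferred knowledge:", round(run_market(new_retail, new_wholesale, demand_bias=0.5)))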

    An intelligent broker agent for energy trading: an MDP approach

    This paper details the development and evaluation of AstonTAC, an energy broker that successfully participated in the 2012 Power Trading Agent Competition (Power TAC). AstonTAC buys electrical energy from the wholesale market and sells it in the retail market. The main focus of the paper is the broker's bidding strategy in the wholesale market. In particular, AstonTAC employs Markov Decision Processes (MDP) to purchase energy at low prices in a day-ahead power wholesale market and to keep energy supply and demand balanced. Moreover, we explain how the agent uses a Non-Homogeneous Hidden Markov Model (NHHMM) to forecast energy demand and price. An evaluation and analysis of the 2012 Power TAC finals show that AstonTAC is the only agent able to buy energy at low prices in the wholesale market while keeping its energy imbalance low.
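    A minimal sketch of casting day-ahead wholesale purchasing as an MDP solved by value iteration, where the state is (auctions remaining, energy still required) and the action is how much to buy at the current expected price. The prices, horizon and imbalance penalty are illustrative assumptions; the paper's actual MDP and its NHHMM price forecasts are not reproduced here.

    HOURS = 4                                 # day-ahead auctions left before delivery
    NEED = 3                                  # MWh still to procure
    PRICE = {3: 40, 2: 30, 1: 35, 0: 50}      # stand-in expected price per MWh per auction
    IMBALANCE_PENALTY = 100                   # cost per MWh still missing at delivery

    # value[(auctions_left, still_needed)] = minimal expected remaining cost
    value, policy = {}, {}
    for need in range(NEED + 1):
        value[(0, need)] = need * IMBALANCE_PENALTY        # terminal imbalance cost
    for hours in range(1, HOURS + 1):
        for need in range(NEED + 1):
            best_cost, best_buy = float("inf"), 0
            for buy in range(need + 1):                    # buy 0..need MWh at this auction
                cost = buy * PRICE[hours - 1] + value[(hours - 1, need - buy)]
                if cost < best_cost:
                    best_cost, best_buy = cost, buy
            value[(hours, need)], policy[(hours, need)] = best_cost, best_buy

    need, plan = NEED, []
    for hours in range(HOURS, 0, -1):                      # roll the optimal policy forward
        buy = policy[(hours, need)]
        plan.append((hours, buy))
        need -= buy
    print("purchases per remaining auction:", plan)
    print("expected total cost:", value[(HOURS, NEED)])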